Word Recognition from Continuous Articulatory Movement Time-series Data using Symbolic Representations

Authors

  • Jun Wang
  • Arvind Balasubramanian
  • Luis Mojica de La Vega
  • Jordan R. Green
  • Ashok Samal
  • B. Prabhakaran
Abstract

Although still in an experimental stage, articulation-based silent speech interfaces may have significant potential for facilitating oral communication in persons with voice and speech problems. An articulation-based silent speech interface converts articulatory movement information to audible words. The complexity of the speech production mechanism (e.g., coarticulation) makes the conversion a formidable problem. In this paper, we report a novel, real-time algorithm for recognizing words from continuous articulatory movements. This approach differs from prior work in that (1) it focuses on the word level rather than the phoneme level; (2) online segmentation and recognition are conducted at the same time; and (3) a symbolic representation (SAX) is used for data reduction in the original articulatory movement time series. A data set of 5,900 isolated word samples of tongue and lip movements was collected using an electromagnetic articulograph from eleven English speakers. The average speaker-dependent recognition accuracy was up to 80.00%, with an average latency of 302 milliseconds for each word prediction. The results demonstrate the effectiveness of our approach and its potential for building a real-time articulation-based silent speech interface for clinical applications. The across-speaker variation in recognition accuracy is also discussed.
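
To make the data-reduction idea concrete, the following is a minimal sketch of the standard SAX (Symbolic Aggregate approXimation) procedure: z-normalize the series, average it over equal-length frames, then map each frame mean to a letter using breakpoints that are equiprobable under a standard normal distribution. It is not the authors' implementation; the function name sax, the number of segments, and the four-letter alphabet are assumptions chosen only for illustration, since the abstract does not state the parameters used.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet="abcd"):
    """Illustrative SAX transform (hypothetical helper, not from the paper)."""
    n = len(series)
    if n_segments < 1 or n < n_segments:
        raise ValueError("need at least one sample per segment")

    # Step 1: z-normalize so that Gaussian breakpoints are meaningful.
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        z = [0.0] * n
    else:
        z = [(x - mu) / sigma for x in series]

    # Step 2: piecewise aggregate approximation (mean of each frame).
    paa = []
    for i in range(n_segments):
        start = i * n // n_segments
        end = (i + 1) * n // n_segments
        paa.append(mean(z[start:end]))

    # Step 3: breakpoints that split N(0, 1) into equiprobable regions,
    # one region per symbol in the alphabet.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]

    # Step 4: map each frame mean to the symbol of its region.
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in paa)


# A low plateau followed by a high plateau collapses to two symbols.
print(sax([0, 0, 0, 0, 10, 10, 10, 10], n_segments=2))  # prints "ad"
```

In a sketch like this, one sensor coordinate (for example, the vertical position of a tongue sensor) would be reduced from many samples to a short string, which is what makes symbol-level matching cheap enough for online use. How the authors segment and match such strings is not described in the abstract.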

Similar resources

Combining acoustic and articulatory feature information for robust speech recognition

The idea of using articulatory representations for automatic speech recognition (ASR) continues to attract much attention in the speech community. Representations which are grouped under the label "articulatory" include articulatory parameters derived by means of acoustic-articulatory transformations (inverse filtering), direct physical measurements or classification scores for pseudo-articul...

Articulatory movement prediction using deep bidirectional long short-term memory based recurrent neural networks and word/phone embeddings

Automatic prediction of articulatory movements from speech or text can be beneficial for many applications such as speech recognition and synthesis. A recent approach has reported state-of-the-art performance in speech-to-articulatory prediction using feed-forward neural networks. In this paper, we investigate the feasibility of using bidirectional long short-term memory based recurrent neural n...

Determining an Optimal Set of Flesh Points on Tongue, Lips, and Jaw for Continuous Silent Speech Recognition

Articulatory data have gained increasing interest in speech recognition with or without acoustic data. Electromagnetic articulograph (EMA) is one of the affordable, currently used techniques for tracking the movement of flesh points on articulators (e.g., tongue) during speech. Determining an optimal set of sensors is important for optimizing the clinical applications of EMA data, due to the in...

Continuous speech recognition using articulatory data

In this paper we show that there is measurable information in the articulatory system which can help to disambiguate the acoustic signal. We measure directly the movement of the lips, tongue, jaw, velum and larynx and parameterise this articulatory feature space using principal components analysis. The parameterisation is developed and evaluated using a speaker dependent phone recognition task ...

Conversational speech recognition using acoustic and articulatory input

The combination of multiple speech recognizers based on different signal representations is increasingly attracting interest in the speech community. In previous work we presented a hybrid speech recognition system based on the combination of acoustic and articulatory information which achieved significant word error rate reductions under highly noisy conditions on a small-vocabulary numbers re...

Publication date: 2013